Introduction

The  alphaviruses  are  a  widespread  group  of  human  pathogens  that  are  present  in  many 
parts of the world (Griffin, 1986; Monath, 1988; Peters and Dalrymple, 1990). They are
mosquito-borne  and  are  particularly  prevalent  in  tropical  and  subtropical  areas  of  the  world,  but 
alphaviruses  pathogenic  for  man  are  also  present  in  temperate  areas.  Many  alphaviruses  are 
capable  of  causing  fever,  rash  and  arthralgia  in  man  that  in  some  cases  may  be  disabling  for 
extended  periods  of  time.  Many  of  the  New  World  alphaviruses  can  cause  encephalitis  in  man. 
Our  program  attempts  to  understand  the  molecular  basis  of  alphavirus  immunogenicity  and  to 
determine  the  relationships  of  alphaviruses  and  strains  of  alphaviruses  to  one  another. 

We  reported  last  year  that  strains  of  Sindbis  virus  from  Northern  Europe  referred  to  as 
Ockelbo virus and Karelian fever virus, which cause an illness characterized by polyarthritis
whose  symptoms  can  persist  for  months  or  years,  were  very  closely  related  to  pathogenic  strains 
of  Sindbis  virus  isolated  from  South  Africa  (Shirako  et  al.,  1991).  We  concluded  that  a  South 
African  strain  of  Sindbis  virus  was  introduced  into  Northern  Europe,  probably  in  the  1960’s, 
either  by  the  activities  of  man  or  by  migratory  birds,  and  this  led  to  epidemics  of  Ockelbo  disease 
in  Sweden.  The  virus  then  spread  to  Finland  and  the  Karelian  region  of  the  Soviet  Union, 
probably  in  the  1980's,  causing  epidemics  of  disease  called  Pogosta  disease  and  Karelian  fever, 
respectively.  We  also  found  that  repeated  sequence  elements  found  in  the  3'  nontranslated  region 
of  Sindbis  viruses  are  much  more  highly  conserved  than  sequences  outside  these  elements,  and 
concluded  that  these  repeated  elements  must  play  an  important  role  in  RNA  replication. 

In  the  past  year  we  have  continued  our  sequencing  efforts  on  alphaviruses  in  order  to 
determine the relations of these viruses to one another. These have included the nsP3-nsP4
regions of a South African strain of Sindbis virus, in order to continue the examination, begun
last year, of the relationships within these domains between South African Sindbis viruses and
the Northern European Ockelbo viruses. We have also examined Whataroa virus, a virus related to Sindbis virus isolated
from  New  Zealand,  an  Indian  isolate  of  Sindbis  virus,  and  an  Australian  isolate  of  Sindbis  virus  in 
order  to  determine  the  relationships  of  these  viruses  to  one  another.  We  have  begun  sequencing  of 
Aura  virus  in  order  to  search  for  emergent  viruses.  We  had  previously  found  that  Western  equine 
encephalitis  virus,  found  throughout  North  and  South  America,  arose  by  recombination  between 
Eastern  equine  encephalitis  virus  and  a  New  World  alphavirus  related  to  Sindbis  virus.  Aura  virus 
has  been  isolated  in  Brazil  and  Northern  Argentina  and  is  known  from  serological  studies  to  be 
related  to  Sindbis  virus.  We  wished  to  determine  if  Aura  virus  was  the  second  parent  of  Western 
equine  encephalitis  virus. 

We  are  also  interested  in  the  localization  of  neutralizing  antibody-binding  sites  in 
alphaviruses. Knowledge of immunogenic domains is important in developing vaccines.
Neutralizing  antibodies  bind  to  the  glycoproteins  of  alphaviruses  and  prevent  them  from  attaching 
to  susceptible  cells  or  prevent  them  from  penetrating  cells.  The  exact  mechanisms  by  which 
neutralizing  antibodies  inactivate  a  virus  are  somewhat  controversial  and  differ  from  case  to  case, 
but  at  least  in  some  cases  the  antibody  neutralizes  by  binding  to  the  structure  on  the  surface  of  the 
virus  that  interacts  with  a  receptor  on  the  cell  surface,  thus  directly  blocking  the  virus  from 
interacting  with  its  receptor.  In  these  cases  anti-idiotypic  antibodies  made  against  such  antibodies 
may  function  as  anti-receptor  antibodies.  In  studies  of  antibody  escape  variants  we  have 
identified  domains  of  glycoprotein  E2  which  appear  to  be  important  for  virus  neutralization 
(Strauss et al., 1991). Here we report that we have been able to use λgt11 expression libraries to
directly  demonstrate  interaction  between  a  neutralizing  antibody  and  a  specific  domain  of 
glycoprotein  E2.  Such  a  result  is  significant  because  cases  have  been  described  in  which  resistance 
to  a  monoclonal  antibody  (mAb)  arose  from  single  amino  acid  substitutions  away  from  the  actual 
antibody  binding  site  (Diamond  et  al.,  1985;  Parry  et  al.,  1990).  Thus  it  is  possible  to  induce 
changes  in  conformation  of  the  antibody-binding  regions  with  amino  acid  substitutions  outside  the 
epitope, and direct demonstration of antibody binding to a defined region is important. Because
the neutralizing monoclonal antibody used here elicits production of anti-idiotypic antibodies
which  act  as  anti-receptor  antibodies  in  chicken  cells  (Wang  et  al.,  1991),  this  domain  is  also 
implicated  in  attachment  to  the  surface  of  a  susceptible  cell.  A  complete  description  of  these  results 
has appeared in the Journal of Virology 65, 7037-7040 (1991). A preprint of this paper, entitled
"Use of a λgt11 expression library to localize a neutralizing antibody-binding site in glycoprotein
E2  of  Sindbis  virus,"  by  K.  S.  Wang  and  J.  H.  Strauss,  was  submitted  to  the  U.S.  Army  Medical 
Research  and  Development  Command  at  the  time  of  submission  to  the  journal. 

Methods  Used 

Virus  Strains.  South  African  strains  of  Sindbis  virus,  Whataroa  virus,  Indian  and 
Australian  isolates  of  Sindbis  virus,  and  Aura  virus  were  obtained  from  Dr.  J.  M.  Dalrymple  of 
USAMRIID.  Viruses  were  grown  and  purified  as  previously  described  (Shirako  et  al.,  1991). 

cDNA  clones  for  most  of  the  viruses  were  produced  using  standard  methods  (Sambrook  et 
al.,  1989).  First  strand  cDNA  was  made  using  oligo(dT)  as  a  primer  and  second  strand  synthesis 
was by the method of Gubler and Hoffman (Gubler and Hoffman, 1983). In some cases HindIII
fragments of the cDNA were cloned into vector pGEM3Z. In other cases EcoRI linkers were added
to double-stranded cDNA and the cDNA cloned into the EcoRI site of pGEM3Z. DNA sequencing
and  RNA  sequencing  used  standard  technology  that  is  in  common  use  in  our  laboratory  (Hahn  et 
al.,  1989;  Rice  et  al.,  1985;  Rice  and  Strauss,  1981;  Shirako  et  al.,  1991;  Strauss  et  al.,  1984). 

Construction  of  a  Random  cDNA  Library  of  Virus  RNA.  We  have  also 
developed  methods  suitable  for  high  throughput  automated  DNA  sequencing  in  order  to  speed  up 
the  acquisition  of  sequence  data.  For  this  we  used  Whataroa  virus,  strain  M78,  isolated  in  1962  at 
Westland,  New  Zealand,  from  Culex  pervigilans,  as  a  test  virus.  The  virus  was  propagated  once 
in primary chicken fibroblast cells and purified by sucrose gradient centrifugation. The RNA was
extracted by an SDS/phenol method, precipitated in ethanol, and suspended in water. First strand
cDNA was synthesized with 2 µg of virus RNA using 200 pmol of pd(N)6 and 2 pmol of dTn
by  AMV  reverse  transcriptase.  The  second  strand  DNA  was  synthesized  by  the  Gubler  and 
Hoffman  (Gubler  and  Hoffman,  1983)  method.  The  double-stranded  cDNA  was  blunt  ended  with 
T4  DNA  polymerase  in  the  presence  of  RNase  A,  extracted  with  phenol/chloroform,  and 
precipitated  with  ethanol.  After  methylating  internal  EcoRI  sites  with  EcoRI  methylase,  the  DNA 
was  electrophoresed  in  an  LMP  agarose  gel  and  a  2-4  kb  fraction  was  isolated  by  a  CTAB  method 
as  described  elsewhere  (Shirako  and  Strauss,  1992).  The  isolated  DNA  was  kinased  with  T4 
DNA  polynucleotide  kinase  and  ligated  to  kinased  EcoRI  linkers.  The  ligation  products  were 
digested  with  EcoRI,  extracted  with  phenol/chloroform,  precipitated  with  ethanol,  and 
electrophoresed  in  an  LMP  agarose  gel.  The  2-4  kb  fraction  was  isolated  by  a  CTAB  method  and 
ligated to an EcoRI-digested, CIAP-treated pGEM3Z vector. The ligated DNA was transformed
into  E.  coli  JM109.  One  hundred  clones  that  appeared  to  contain  inserts  were  selected  randomly 
and  characterized  by  restriction  analysis  of  the  DNA  prepared  from  0.5  ml  of  bacterial  cultures. 
Ninety-six clones were found to contain inserts larger than 1.0 kb. Fifty clones containing
larger  inserts  were  further  selected  and  the  DNA  was  prepared  from  10  ml  of  bacterial  cultures  by 
a  modified  boiling  method. 
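
As a rough check on how well a set of random clones represents a genome, the expected coverage can be sketched as below. This is an illustrative calculation only, not part of the methods used here. It assumes that clone start points are uniformly random and ignores end effects. The genome length of 11,700 nt is an assumed round figure for an alphavirus, since the length of the Whataroa genome is not stated in this report, and the function name is hypothetical.

```python
def fraction_covered(genome_len: int, insert_len: int, n_clones: int) -> float:
    """Expected fraction of the genome present in at least one of n random clones.

    Each clone is treated as covering insert_len / genome_len of the genome,
    independently of the others (a simplification that ignores end effects).
    """
    per_clone = insert_len / genome_len
    return 1.0 - (1.0 - per_clone) ** n_clones


# ASSUMPTION: genome length chosen for illustration; not taken from this report.
GENOME_LEN = 11_700

# The 2-4 kb size fraction and the 50 selected clones are taken from the text above.
for insert_len in (2_000, 4_000):
    coverage = fraction_covered(GENOME_LEN, insert_len, 50)
    print(f"insert {insert_len} nt, 50 clones: expected coverage {coverage:.5f}")
```

Under these assumptions even the smallest inserts in the size fraction would be expected to leave very little of the genome unrepresented, although real libraries are rarely uniformly random.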

Construction  and  Screening  of  the  Bacteriophage  Library.  Sindbis  virus  strain 
AR339,  from  A.  Schmaljohn  of  USAMRIID,  was  grown  in  monolayers  of  primary  chicken 
embryo  fibroblasts  (Pierce  et  al.,  1974).  Virus  was  purified  as  described  (Bell  et  al.,  1979), 
disrupted  with  0.5%  SDS,  and  49S  genomic  RNA  extracted  with  phenol/chloroform  (Hsu  et  al., 
1973).  After  two  ethanol  precipitations,  RNA  was  suspended  in  distilled  water  and  stored  at  -70°C 
until  use  as  a  template  for  cDNA  synthesis. 

A λgt11 library containing short inserts of Sindbis cDNA was constructed by a
modification of the procedure of Young and Davis (Young and Davis, 1983). cDNA synthesis
was randomly primed with sonicated salmon testis DNA; [32P]dCTP was included during cDNA
synthesis in order to monitor the product. After flush-ending with the Klenow fragment of DNA
polymerase I, methylation with EcoRI methyltransferase, and addition of EcoRI linkers
(Collaborative Research), the modified cDNA was digested with an excess of EcoRI restriction
enzyme. The digested cDNA was then fractionated on a Sephadex CL-6B column, and Sindbis
cDNA fragments 100-300 base pairs in size were pooled and ligated to dephosphorylated λgt11
arms (Promega). After in vitro packaging into phage heads (Stratagene), the percentage of phage
containing Sindbis virus cDNA inserts was found to be 90% by plating phage on E. coli Y1090 in
the presence of 5-bromo-4-chloro-3-indolyl β-D-galactoside. Plaques were screened for reactivity
with the various mAbs. Phage plaques were grown for 6 hrs at 42°C, nitrocellulose disks
(Schleicher & Schuell) soaked in 10 mM isopropyl thio-β-D-galactopyranoside were then placed
on the top of the agar layer, and the plates were transferred to 37°C for 15 hrs. The filters then
were lifted and washed successively in 10 mM Tris-Cl pH 7.5 and 150 mM NaCl containing 5%
nonfat milk. The filters were incubated overnight at 4°C with monoclonal antibody (10 µg/ml in
PBS containing 5% nonfat milk), washed, and then incubated for at least 2 hr at room temperature
with 125I-conjugated protein G (0.5 µCi/ml in 5% nonfat milk). After washing
and drying, the filters were exposed overnight at -80°C to Kodak X-OMAT film. Immunoreactive
phage  were  picked  and  rescreened  until  a  uniformly  reactive  population  was  obtained. 

Sequence  of  the  nsP3  and  nsP4  Region  of  Alphaviruses 

We  have  obtained  the  complete  sequence  of  the  nsP3-nsP4  region,  approximately  3.5  kb, 
for  four  new  alphaviruses.  These  are  a  South  African  strain  of  Sindbis  virus  isolated  from  a 
human  case  of  Sindbis  disease,  Whataroa  virus  from  New  Zealand,  an  Indian  isolate  of  Sindbis 
virus,  and  an  Australian  isolate  of  Sindbis  virus.  Information  on  these  strains  and  on  a  number  of 
other strains with which we are currently working is given in Fig. 1. Shown are the name of the
strain, the source from which the virus was isolated, the year and place of isolation, and the status
of our work with the virus.

Figure 1. Strains of Sindbis virus used in this study.

The  four  new  sequences  obtained  are  presented  in  Figs.  2  to  5.  These  nucleotide 
sequences  and  the  amino  acid  sequences  deduced  from  them  illustrate  the  close  relationships 
among  these  alphaviruses  and  confirm  that  South  African  strains  of  Sindbis  virus  are  very  closely 
related  to  Ockelbo  virus  and  its  allies.  nsP4  in  particular  is  very  highly  conserved.  The  C-terminal 
domain  of  nsP3,  which  is  not  highly  conserved  among  alphaviruses,  shows  more  variability,  but 
in  each  case  there  is  an  opal  termination  codon  between  nsP3  and  the  beginning  of  nsP4  which 
must  be  read  through  in  order  to  produce  nsP4. 
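
The position of the opal codon can be checked directly from the sequences in Figs. 2 to 4. The sketch below is illustrative and is not the software used to prepare the figures. It translates the 60-nucleotide line beginning at nt 1621 from each figure with the standard genetic code and reports the first in-frame stop codon. The function and variable names are hypothetical.

```python
# Standard genetic code, with bases ordered U, C, A, G at each codon position.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(rna: str) -> str:
    """Translate an RNA string in frame 1; '*' marks a stop codon."""
    usable = len(rna) - len(rna) % 3
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3))


def first_stop(rna: str, first_nt: int):
    """Return (nt position, codon) of the first in-frame stop, or None."""
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if CODON_TABLE[codon] == "*":
            return first_nt + i, codon
    return None


# Lines beginning at nt 1621, copied from Figs. 2, 3 and 4 respectively.
SEGMENTS = {
    "A1036": "AAAAGAACCGAAUAUUGACUAACCGGGGUAGGUGGGUACAUCUUCUCAACUGACACGGGA",
    "MRM18520": "AAAACCGAAUAUUGACUAACCGGGGUAGGUGGGUAUAUCUUCUCGACUGACACGGGACCG",
    "Girdwood": "CGCAAGCAGAGACGUAGACGCAGGAGCAGGAGGACCGAAUACUGACUAACCGGGGUAGGU",
}

for strain, rna in SEGMENTS.items():
    print(strain, translate(rna), first_stop(rna, 1621))
```

In each of the three strains the first in-frame stop in this window is UGA (opal), followed by the conserved residues LTGVGGY that precede and include the amino terminus of nsP4.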

The  relationships  among  these  viruses  are  illustrated  in  numerical  fashion  in  Fig.  6.  South 
African  Girdwood  and  Ockelbo  exhibit  only  1.3%  sequence  divergence  in  nsP4  and  only  1.8% 
divergence  in  the  conserved  region  of  nsP3.  The  Indian  and  Australian  isolates  have  diverged  by 
7-10%  from  these  strains  in  nsP3  and  nsP4.  Whataroa  virus  is  clearly  related  to  these  Sindbis 
viruses  but  differs  by  12-16%  in  amino  acid  sequence  in  these  regions  from  the  Sindbis  virus 
strains. 
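
For readers who wish to reproduce divergence figures of this kind, a minimal sketch follows. It assumes the two sequences are already aligned and of equal length, with no insertions or deletions, and simply counts mismatched positions. The example uses only the first 20 residues of nsP3 from Figs. 2, 3 and 4, so its output illustrates the calculation and is not comparable to the region-wide values quoted above. The function name is hypothetical.

```python
def percent_divergence(seq_a: str, seq_b: str) -> float:
    """Percentage of positions that differ between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * mismatches / len(seq_a)


# First 20 residues of nsP3, copied from Figs. 2, 3 and 4.
A1036 = "APAYRSKRENIAECLEEAVV"
MRM18520 = "APAYRSKRENIAECLEEAVV"
GIRDWOOD = "APSYRTKRENIADCQEEAVV"

print("A1036 vs MRM18520:", percent_divergence(A1036, MRM18520))
print("A1036 vs Girdwood:", percent_divergence(A1036, GIRDWOOD))
```

A full comparison would first align the complete nsP3 and nsP4 sequences, since the C-terminal domain of nsP3 differs in length among strains.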


High  Throughput  Automated  DNA  Sequencing 

Several  companies,  including  Applied  Biosystems,  now  make  automated  DNA  sequencers 
which  can  greatly  speed  up  the  rate  of  acquisition  of  sequence  data.  In  order  to  use  such  a  system, 
random cDNA clones must be constructed which represent the entire viral genome and DNA must be
prepared from such clones that is highly purified and suitable for automated sequencing. We have
shown that it is feasible to use the Applied Biosystems sequencer to sequence alphaviruses by
using Whataroa virus as a test virus. Random cDNA clones were constructed in a plasmid vector
and plasmid DNA was subjected to high throughput automated DNA sequencing. Preparation of
plasmid cDNA libraries containing a representative sampling of the Whataroa genome required


Figure 2. nsP3/nsP4 of A1036 (1953, India, Bdellonyssus bursa)


1  GCUCCGGCCUAUCGCUCGAAACGUGAGAACAUCGCCGAGUGCCUCGAAGAGGCCGUAGUU  60 

APAYRSKRENIAECLEEAVV 

61  AAUGCCGCGAAUGCACUCGGACGGCCGGGCGAAGGGGUAUGCAAAGCCAUAUAUAAAAAA  120 

NAANALGRPGEGVCKAIYKK 

121  UGGCCUAAUAGUUUCGUCGAUUCCGCGACAGAGACUGGAACGGCUAAGCUAGUGUGCUGU  180 

WPNSFVDSATETGTAKLVCC 

181  CAAGGAAAGAAAAUUAUCCACGCCGUCGGACCCGACUUCCGCAAACACUCCGAGGCAGAA  240 

QGKKIIHAVGPDFRKHSEAE 

241  GCACUGAAGAUUCUCCAGAACACAUACCACGCCAUAGCAGAUUUGGUUAACAAACAUGGA  300 

ALKILQNTYHAIADLVNKHG 

301  AUCAAGACUGUAGCGAUCCCGCUACUAUCCACCGGGAUUUACGCAGCGGGAAAAGACAGA  360 

IKTVAIPLLSTGIYAAGKDR 

361  CUCGAGGUCUCCUUAAACUGUCUUACCACCGCCCUGGACAGAACAGACGCAGACGUCACA  420 

LEVSLNCLTTALDRTDADVT

421  AUCUACUGUCUAGACAAAAAAUGGAAAGAAAGGAUCGAUGCGGUUAUACAAUUGAAGGAG  480 

IYCLDKKWKERIDAVIQLKE 

481  UCGGUGACGGAACUGAAGGAUGAGGAUAUGGAGAUCGACGAUGAGUUAGUAUGGAUCCAC  540 

SVTELKDEDMEIDDELVWIH 

541  CCGGAUAGUUGUCUCAAGGGCAGGAAAGGGUAUAGCACAACAAAAGGUAAACUUUAUUCG  600 

PDSCLKGRKGYSTTKGKLYS

601  UACUUUGAGGGGACUAAGUUUCAUCAGGCAGCAAAAGACAUGGCGGAGAUUAAAGUACUU  660 

YFEGTKFHQAAKDMAEIKVL 

661  UUUCCCGAUGAGCAAGAGUGCAACGAGCAGUUGUGUGCAUACAUCCUUGGUGAAACCAUG  720 

FPDEQECNEQLCAYILGETM 

721  GAAGCCAUCAGGGAAAAAUGUCCAGUGGACUUUAAUCCGUCGUCCAGUCCGCCGAAGACA  780 

EAIREKCPVDFNPSSSPPKT 

781  CUCCCCUGUUUGUGCAUGUAUGCCAUGACGCCUGAGAGAGUGCACCGUCUGCGUAGCAAC  840 

LPCLCMYAMTPERVHRLRSN 

841  AACGUCAAGUCCAUCACAGUGUGUUCGUCUACCCCACUUCCGAAGCACAAGAUCAAGAAC  900 

NVKSITVCSSTPLPKHKIKN 

901  GUUCAGAAAGUACAGUGCACGAAAGUGGUCUUGUUCAAUCCACAGACCCCUGAAUUUGUC  960 

VQKVQCTKVVLFNPQTPEFV 

961  CCUGCCCGUAAGUACAUAGAAGCACAACCAAAAGACGUAAGCCAAGAUGCAGAAGAAAGC  1020 

PARKYIEAQPKDVSQDAEES 

1021  CCUGCCGCAGCCGCCCGAGAUAACACCUCACGGGACGUAACAGACAUAUCCCUGGAUGUG  1080 

PAAAARDNTSRDVTDISLDV 

1081  GAAGAAAGUCAAGCCGCAGCCGGCCAACCAGAGGAGCGCUCGGGGGACAACACUUCCCGG  1140 

EESQAAAGQPEERSGDNTSR 

1141  GAUGUAACAGAUAUAUCCCUAGAUCACGACAGCGAUAGUGAGGUGGGCUCCAUCUUCUCU  1200 

DVTDISLDHDSDSEVGSIFS 

1201  AACCUUAGCUGCUCCAGUCAAUCCAUCACUAGUAUGGACAGCUGGUCCUCCGGACCGGGA  1260

NLSCSSQSITSMDSWSSGPG

1261  UCGAUCACGAUAAACGAGAACCGCACCAUUCAGGUCACGGCGGAGAUACACAAUGCUCCU 
SITINENRTIQVTAEIHNAP 

1321  GCCGCGUUGCCUGUUCCACCACCACGCCUUAAGAAACUGGCACGCUUAGCAGCCCAGAAG 
AALPVPPPRLKKLARLAAQK 

1381  CCCAAUCCGCCAUCCGACCCGCCUUCGACGGUCGAGGACGUGUCGAUGCGCUUGUCCUUC 
PNPPSDPPSTVEDVSMRLSF 

1441  CCUGCCACGGUGUCGUUCGGAUCAUUCUCCGACGGAGAAGUCGACGACCUUAGCCGCGAU 
PATVSFGSFSDGEVDDLSRD 

1501  AAAGCAGUGUCAGAACCGGUGGUCUUUGGUGCUUUCGAGCCUGGAGAGGUAACCUCUAUC 
KAVSEPVVFGAFEPGEVTSI 

1561  AUCGAAUCAAGGUCUGUCGUGUCAUUCCCCGUGCAUAAACGCCGGCGCAGAAGACGGGGC 
IESRSVVSFPVHKRRRRRRG 

1621  AAAAGAACCGAAUAUUGACUAACCGGGGUAGGUGGGUACAUCUUCUCAACUGACACGGGA 
KRTEY*LTGVGGYIFSTDTG 

1681  CCGGGCCACCUCCAGAAGAAGUCAGUUCUGCAAAACCAGCUUACUGAACCGACCCUCGAG 
PGHLQKKSVLQNQLTEPTLE 

1741  CGCAAUCAAUUAGAACGAAUGUAUGCGCCCAGUCUCGAUGUCAAGAAAGAGGAACUUCUG 
RNQLERMYAPSLDVKKEELL 

1801  AAACUUAAGUACCAAAUGAUGCCCACCGAAGCCAAUAAAAGUAGGUACCAGUCUAGAAAG 
KLKYQMMPTEANKSRYQSRK 

1861  GUUGAAAAUCAAAAAGCGGUAACCACCGAGAGGUUACUGUCGGGACUGAAGAUGUACAUC 
VENQKAVTTERLLSGLKMYI 

1921  CACUCAGAGAACCAACCUGAGUGUUAUAAGGUCACUUAUCCGAAACCGUCGUACUCCAGC 
HSENQPECYKVTYPKPSYSS 

1981  AGUGUCCCUCUUAGUUACCAGAACCCUGAAUUCGCCGUAGCUGUUUGCAAUAACUACCUG
SVPLSYQNPEFAVAVCNNYL 

2041  CAUGAGAACUACCCGACGGUUGCCUCCUAUCAGAUUACGGACGAAUAUGAUGCCUACCUC 
HENYPTVASYQITDEYDAYL 

2101  GACAUGGUGGACGGCACUGUUGCGUGUCUCGACACUGCAACAUUCUGCCCUGCGAAAUUA
DMVDGTVACLDTATFCPAKL

2161  CGUAGCUUUCCGAAGAAACAUGAGUACCGCGCACCUAACAUCAGGAGUGCCGUGCCGUCU 
RSFPKKHEYRAPNIRSAVPS 

2221  GCUAUGCAGAACACUCUACAGAACGUCCUGAAUGCAGCAACAAAGAGGAAUUGCAACGUU
AMQNTLQNVLNAATKRNCNV 

2281  ACUCAGAUGAGAGAACUACCGACCCUAGACUCCGCGACCUUUAACGUGGAAUGCUUCCGA 
TQMRELPTLDSATFNVECFR 

2341  AAGUACGCGUGCAAUGACGAGUAUUGGGCUGAAUUCUCCGAAAAACCAAUCAGGAUCACC
KYACNDEYWAEFSEKPIRIT 

2401  ACGGAGUUUGUUACGGCGUACGUGGCGAGAUUGAAGGGACCAAAGGCUGCUGCUCUGUUU 
TEFVTAYVARLKGPKAAALF 

2461  GCAA.VACGCAUAACCUAGUCCCAUUGCAAGAAGUACCUAUGGACAGGUUUGUGAUGGAC
AKTHNLVPLQEVPMDRFVMD

2521  AUGAAGCGAGAUGUCAAGGUGACUCCGGGCACAAAACACACCGAAGAAAGGCCUAAGGUG  2580 

MKRDVKVTPGTKHTEERPKV 

2581  CAGGUAAUCCAAGCGGCUGAGCCUUUUGCUACAGCCUACCUUUGUGGCAUCCACCGAGAG  2640

QVIQAAEPFATAYLCGIHRE 

2641  CUGGUACGCCGGCUUACCGCGGUUCUACUCCCGAACGUACACACCCUGUUUGACAUGUCU  2700 

LVRRLTAVLLPNVHTLFDMS 

2701  GCGGAGGAUUUCGACGCGAUUAUUGCCGAGCAUUUCCGACAAGGUGACGCCGUGCUCGAG  2760 

AEDFDAIIAEHFRQGDAVLE

2761  ACAGACAUCGCGUCAUUCGAUAAGAGUCAGGACGAUGCGAUGGCCCUGACUGGGCUGAUG  2820 

TDIASFDKSQDDAMALTGLM 

2821  AUCCUGGAGGACCUCGGCGUCGAUCAACCGCUGCUGGACCUCAUCGAGUGUGCCUUCGGA  2880 

ILEDLGVDQPLLDLIECAFG 

2881  GAAAUAUCAUCUACGCAUCUGCCUACUGGGACACGGUUUAAGUUCGGCUCAAUGAUGAAA  2940 

EISSTHLPTGTRFKFGSMMK 

2941  UCCGGAAUGUUUCUUACGCUCUUCGUGAACACCAUCUUGAAUGUCGUGAUCGCUAGUCGC  3000 

SGMFLTLFVNTILNVVIASR 

3001  GUGCUUGAGCACAGGUUAACAGGAUCACGAUGUGCCGCAUUCAUUGGAGACGAUAACAUC  3060 

VLEHRLTGSRCAAFIGDDNI 

3061  AUCCACGGCGUGGUAUCAGACAAGGAAAUGGCCGAAAGGUGCGCCACUUGGCUGAAUAUG  3120 

IHGVVSDKEMAERCATWLNM

3121  GAGGUAAAAAUCAUUGACGCGGUGAUCGGCGAGCGUCCUCCGUAUUUCUGUGGUGGCUUU  3180 

EVKIIDAVIGERPPYFCGGF 

3181  AUACUACAGGACUCUGUCACCCAAACAGCCUGUCGAGUGGCUGACCCCCUAAAAAGACUG  3240 

ILQDSVTQTACRVADPLKRL 

3241  UUCAAGCUAGGAAAACCUUUGCCCGCAGAUGAUGACCAAGAUGAAGACAGAAGAAGGGCU  3300 

FKLGKPLPADDDQDEDRRRA 

3301  UUGCUGGAUGAGACUAAGGCGUGGUUUAGAGUGGGCAUAACCGAAACAUUGGCUACUGCG  3360 

LLDETKAWFRVGITETLATA

3361  GUAGCAACGCGGUACGAAGUUGAUAACAUCACGCCUGUCCUGCUGGCACUGAGGACCCUU  3420 

VATRYEVDNITPVLLALRTL

3421  GCGCAAAGCAAGAGAUCCUUUCAGUCCAUAAGAGGGGAAAUGAAGCAUCUCUACGGUGGU  3480 

AQSKRSFQSIRGEMKHLYGG 

3481  CCUAAAUAG  3489 

PK*

Figure  2.  Translated  sequence  of  the  nsP3-nsP4  region  of  Sindbis  strain  A1036, 
using  the  single  letter  amino  acid  code.  nsP3  and  nsP4  are  translated  as  part  of  a 
polyprotein  encoded  by  nucleotides  (nts)  4100  to  7600  in  the  type  virus  genome.  In 
this  and  the  following  3  figures,  nts  are  numbered  from  the  amino  terminus  of 
nsP3.  The  star  at  nt  1636  indicates  the  opal  codon  separating  nsP3  and  nsP4;  the 
star  at  nt  3489  is  the  termination  codon  of  nsP4.  The  amino  terminal  residue  of 
processed  nsP4  is  Tyr  (Y)  encoded  by  nts  1657-1659. 
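
The nucleotide coordinates given in this caption can be converted to codon numbers with simple arithmetic, sketched below. The sketch assumes, consistent with the sequence shown above, that the opal codon begins at nt 1636 and that the final termination codon ends at nt 3489. The helper name is hypothetical.

```python
def codon_index(nt_pos: int) -> int:
    """1-based codon number containing a 1-based nucleotide position (frame 1)."""
    if nt_pos < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_pos - 1) // 3 + 1


# Coordinates taken from the Figure 2 caption.
OPAL_FIRST_NT = 1636   # first nt of the opal (UGA) codon
NSP4_FIRST_NT = 1657   # first nt of the Tyr codon that begins processed nsP4
LAST_NT = 3489         # last nt of the nsP4 termination codon

opal_codon = codon_index(OPAL_FIRST_NT)
nsp4_first_codon = codon_index(NSP4_FIRST_NT)
stop_codon = codon_index(LAST_NT)

# Residues of nsP3 encoded upstream of the opal codon.
nsp3_residues_before_opal = opal_codon - 1
# Codons lying between the opal codon and the first codon of processed nsP4.
codons_between = nsp4_first_codon - opal_codon - 1
# Residues of processed nsP4, excluding its termination codon.
nsp4_residues = stop_codon - nsp4_first_codon

print(opal_codon, nsp4_first_codon, stop_codon)
print(nsp3_residues_before_opal, codons_between, nsp4_residues)
```

The same arithmetic applies to Figures 3 and 4 once the corresponding coordinates are read from those sequences.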
Figure 3. nsP3/nsP4 of MRM18520 (1975, Australia, mosquito pool)


1  GCUCCGGCCUACCGCUCGAAACGUGAGAAUAUCGCCGAAUGCCUUGAAGAGGCCGUAGUU  60 
APAYRSKRENIAECLEEAVV 

61  AACGCCGCGAACCCACUCGGACGUCCGGGCGAAGGGGUGUGUAAAGCCAUAUAUAAAAAA  120 

NAANPLGRPGEGVCKAIYKK 

121  UGGCCCAAUAGUUUUGUCGAUUCUGCGACAGAGACUGGAACAGCUAAGCUAGUGUGCUGU  180 

WPNSFVDSATETGTAKLVCC 

181  CAAGGAAAAAAGAUUAUCCAUGCCGUCGGACCUGACUUCCGUAAACACCCCGAGGCAGAA  240 

QGKKIIHAVGPDFRKHPEAE 

241  GCGCUGAAGAUUCUCCAGAACACAUACCACGCCAUCGCAGAUUUGGUUAACAAACAUGGA  300 

ALKILQNTYHAIADLVNKHG 

301  AUCAAGACCGUAGCGAUCCCGCUUCUAUCCACCGGGAUUUACGCAGCGGGAAAAGACAGA  360 

IKTVAIPLLSTGIYAAGKDR 

361  CUUGAGGUCUCUUUAAACUGCCUCACUACCGCCCUGGACAGAACUGACGCAGACGUCACA  420 

LEVSLNCLTTALDRTDADVT 

421  AUCUACUGCCUUGACAAAAAAUGGAAAGAACGGAUUGAUGCGUUUAUACAGUUGAAGGAG  480 

IYCLDKKWKERIDAFIQLKE

481  UCGGUGACGGAACUGAAGGAUGAUGACAUGGAGAUCGACGACGAAUUAGUAUGGAUCCAC  540 

SVTELKDDDMEIDDELVWIH 

541  CCGGAUAGUUGCCUCAAGGGUAGGAAAGGGUUUAGUACGACGAAGGGCAAGCUCUACUCG  600 

PDSCLKGRKGFSTTKGKLYS

601  UACUUUGAGGGGACUAAAUUUCAUCAAGCAGCAAAAGACAUGGCUGAGAUCAAGGUACUU  660 

YFEGTKFHQAAKDMAEIKVL 

661  UUUCCCGAUGAGCAAGAGUGCAACGAGCAACUGUGUGCAUACAUUCUAGGCGAAACCAUG  720 

FPDEQECNEQLCAYILGETM 

721  GAAGCCAUCAGGGAAAAAUGUCCAGUGGACUUUAAUCCGUCGUCCAGUCCGCCGAAGACG  780 

EAIREKCPVDFNPSSSPPKT 

781  CUUCCCUGUUUGUGUAUGUACGCCAUGACGCCCGAGAGAGUGCACCGCUUGCGUAGCAAU  840 

LPCLCMYAMTPERVHRLRSN 

841  AACGUCAAAUCCAUCACAGUAUGCUCGUCAACCCCGCUUCCGAAGCACAAAAUUAAGAAC  900 

NVKSITVCSSTPLPKHKIKN 

901  GUUCAGAAAGUACAGUGCACGAAAGUAGUCCUAUUCAACCCACAAACGCCUGAAUUUGUC  960 

VQKVQCTKVVLFNPQTPEFV 

961  CCUGCCCGCAAGUACAUAGAAACACAACCGAAGGACGACAGUCAAGAGGCGGAAGAAAAC  1020 

PARKYIETQPKDDSQEAEEN 

1021  CCUGCCGCAGCCGAUAACACUUCACGGGAUGUAACAGACGUAUCUCUAGAUGUGGAAGGA  1080 

PAAADNTSRDVTDVSLDVEG 

1081  GAUCGCGUUGCGGCCAACCGAUCAGAGGUGCACUCAGAGGACAACACCUCCCGAGAUGUA  1140 

DRVAANRSEVHSEDNTSRDV 

1141  ACAGACAUAUCUCUAGACCACAACAGUGAUAGCGAGGUGGGCUCCAUUUUCUCUGACCUC  1200 

TDISLDHNSDSEVGSIFSDL 

1201  AGCUGCUCCAGUCAUUCCAUCACCAGCAUGGACAGCUGGUCCUCCGGACCGAGCUCGAUC  1260 

SCSSHSITSMDSWSSGPSSI

1261  AUGCUAAACGGGAAUCACACCAUCCAGGUCACGGCAGAGAUACACAACGCUCCUGCUGCA  1320 
MLNGNHTIQVTAEIHNAPAA 

1321  CCGCCCGUACCACCACCACGCCUCAAGAAACUGGCGCGCUUGGCAGCUCAGAAGUCCGAU  1380 

PPVPPPRLKKLARLAAQKSD 

1381  CCGCCAUCCAGCCCGCCCUCAACGGUUGAGGACGUGUCGAUGCGCCUGUCAUUCCCUGCC  1440 

PPSSPPSTVEDVSMRLSFPA 

1441  ACGGUGUCAUUCGGAUCUUUUUCUGACGGCGAAGUCGACGAUCUUAGUCGCGAAAAAGCA  1500 

TVSFGSFSDGEVDDLSREKA 

1501  GUGUCAGAACCAGUGGUCUUUGGUGCUUUCGAGCCAGGAGAGGUAACAUCUAUCAUUGAA  1560 

VSEPVVFGAFEPGEVTSIIE 

1561  GCAAGGUCUGUCGUGUCAUUCCCCGUGAAUAAACGCCGGCGCAGGAGACGGGGCCAAAAG  1620 

ARSVVSFPVNKRRRRRRGQK 

1621  AAAACCGAAUAUUGACUAACCGGGGUAGGUGGGUAUAUCUUCUCGACUGACACGGGACCG  1680 

KTEY*LTGVGGYIFSTDTGP 

1681  GGUCACCUCCAGAAAAAAUCGGUUCUACAAAACCAGCUUACGGAACCGACCCUCGAGCGU  1740 

GHLQKKSVLQNQLTEPTLER 

1741  AAUCAAUUAGAACGAGUGUAUGCACCCAGUCUUGAUGCCAAGAAAGAGGAACUCUUGAAA  1800 

NQLERVYAPSLDAKKEELLK 

1801  CUCAAGUACCAAAUGAUGCCCACCGAAGCCAAUAAAAGUAGGUACCAGUCUAGAAAGGUA  1860 

LKYQMMPTEANKSRYQSRKV 

1861  GAAAACCAAAAAGCCGUAACCACCGAGAGGUUACUGUCGGGAUUGAAGAUGUACAUUCAC  1920 

ENQKAVTTERLLSGLKMYIH

1921  UCAGAGAACCAACCCGAGUGUUACAAGGUCACCUAUCCGAAACCGUCGUACUCUAGCAGU  1980 

SENQPECYKVTYPKPSYSSS 

1981  GUUCCCCUUAGUUACCAGAGCCCCGAAUUCGCCGUAGCCGUCUGCAAUAACUACCUGCAU  2040 

VPLSYQSPEFAVAVCNNYLH 

2041  GAGAAUUAUCCAACGGUUGCCUCCUAUCAGAUUACGGAUGAAUAUGACGCCUACCUUGAC  2100 

ENYPTVASYQITDEYDAYLD 

2101  AUGGUGGACGGCACCGUAGCGUGUCUCGACACCGCUACAUUUUGCCCCGCGAAAUUACGC  2160 

MVDGTVACLDTATFCPAKLR 

2161  AGCUUCCCGAAGAAACACGAGUACCGAGAACCUAACAUCAGGAGCGCCGUACCGUCCGCU  2220 

SFPKKHEYREPNIRSAVPSA 

2221  AUGCAGAACACUCUACAGAACGUCCUGAACGCAGCAACAAAGAGGAAUUGCAAUGUUACU  2280 

MQNTLQNVLNAATKRNCNVT 

2281  CAGAUGAGAGAACUACCGACUUUAGACUCCGCAACCUUUAAUGUGGAAUGCUUUCGAAAG  2340 

QMRELPTLDSATFNVECFRK 

2341  UACGCGUGCAACGACGAGUAUUGGGCUGAAUUCUCCGAAAAACCAAUUAGGAUCACCACA  2400 

YACNDEYWAEFSEKPIRITT 

2401  GAGUUUGUCACGGCGUACGUGGCGAGAUUGAAGGGACCAAAGGCUGCUGCACUGUUUGCU  2460 

EFVTAYVARLKGPKAAALFA 

2461  AAAACGCAUAACCUAGUCCCACUGCAAGAAGUACCUAUGGACAGGUUUGUGAUGGACAUG  2520 

KTHNLVPLQEVPMDRFVMDM

2521  AAGCGAGACGUUAAGGUGACUCCGGGCACGAAGCACACCGAAGAAAGACCCAAAGUGCAG  2580 

KRDVKVTPGTKHTEERPKVQ 

2581  GUAAUCCAAGCGGCAGAGCCUCUAGCUACAGCCUAUUUAUGCGGCAUCCACCGUGAGCUG  2640 

VIQAAEPLATAYLCGIHREL 

2641  GUACGCAGGCUUACCGCAGUCCUGCUUCCGAACGUACACACCCUUUUUGAUAUGUCUGCG  2700 

VRRLTAVLLPNVHTLFDMSA 

2701  GAAGAUUUCGAUGCUAUCAUUGCCGAGCAUUUUCACCAGGGUGACGCUGUGCUCGAGACA  2760 

EDFDAIIAEHFHQGDAVLET

2761  GACAUCGCGUCGUUCGAUAAGAGCCAAGACGAUGCGAUGGCCCUGACGGGGCUGAUGAUC  2820 

DIASFDKSQDDAMALTGLMI 

2821  CUGGAGGACCUCGGAGUCGACCAGCCAUUGCUGGACCUCAUCGAGUGCGCCUUCGGGGAA  2880 

LEDLGVDQPLLDLIECAFGE 

2881  AUAUCAUCUACGCACCUGCCGACCGGGACACGGUUUAAGUUCGGCUCAAUGAUGAAAUCC  2940 

ISSTHLPTGTRFKFGSMMKS

2941  GGAAUGUUCCUCACGCUCUUUGUGAACACCAUCUUGAAUGUCGUGAUAGCUAGUCGCGUG  3000 

GMFLTLFVNTILNVVIASRV 

3001  CUCGAGCACAGGUUAGCAGAAUCACGAUGCGCCGCAUUCAUCGGAGACGACAAUAUUAUU  3060 

LEHRLAESRCAAFIGDDNII 

3061  CACGGCGUGGUAUCCGACAAAGAAAUGGCUGAAAGGUGCGCCACUUGGCUGAAUAUGGAG  3120 

HGVVSDKEMAERCATWLNME 

3121  GUAAAAAUUAUUGACGCAGUAAUUGGCGAACGUCCUCCGUACUUCUGUGGCGGCUUUAUA  3180 

VKIIDAVIGERPPYFCGGFI 

3181  CUGCAGGACUCAGUCACCCAAACAGCCUGCCGAGUGGCGGACCCCCUAAAAAGAUUGUUC  3240 

LQDSVTQTACRVADPLKRLF 

3241  AAAUUAGGAAAACCAUUACCUGCAGAUGAUGACCAAGAUGAAGACAGAAGAAGGGCUCUG  3300 

KLGKPLPADDDQDEDRRRAL 

3301  CUGGAUGAGACCAAGGCGUGGUUUAGAGUGGGCAUAACUGAGACACUGGCUACUGCGGUA  3360 

LDETKAWFRVGITETLATAV 

3361  GCAACGCGGUAUGAAGUUGAUAACAUCACACCGGUCCUGCUGGCACUGAGGACCCUUGCG  3420 

ATRYEVDNITPVLLALRTLA 

3421  CAAAGCAAGAGAUCUUUUCAGGCCAUAAGGGGGAAAAUGAAGCAUCUCUACGGUGGUCCU  3480 

QSKRSFQAIRGKMKHLYGGP 

3481  AAAUAG  3486 

K*


Figure  3.  Translated  sequence  of  the  nsP3-nsP4  region  of  the  MRM18520  strain  of 
Sindbis from Australia. Conventions are the same as in Figure 2.

Figure 4. nsP3/nsP4 of Girdwood (1963, South Africa, human)


1  GCACCGUCAUACCGCACUAAAAGGGAGAACAUUGCUGAUUGUCAAGAGGAAGCAGUUGUC  60 
APSYRTKRENIADCQEEAVV 

61  AAUGCAGCCAAUCCGCUGGGCAGACCAGGCGAAGGAGUCUGCCGUGCCAUCUAUAAACGU  120 

NAANPLGRPGEGVCRAIYKR 

121  UGGCCGAACAGUUUCACCGAUUCAGCCACAGAGACCGGCACCGCAAAACUGACUGUGUGC  180 

WPNSFTDSATETGTAKLTVC 

181  CAAGGAAAGAAAGUGAUCCACGCGGUUGGCCCUGAUUUCCGGAAACACCCAGAGGCAGAA  240 

QGKKVIHAVGPDFRKHPEAE 

241  GCCCUGAAAUUGCUGCAAAACGCCUACCAUGCAGUGGCAGACUUAGUAAAUGAACAUAAU  300 

ALKLLQNAYHAVADLVNEHN 

301  AUCAAGUCUGUCGCCAUCCCACUGCUAUCUACAGGCAUUUACGCAGCCGGAAAAGACCGC  360 

IKSVAIPLLSTGIYAAGKDR 

361  CUUGAAGUAUCACUUAACUGCUUGACAACCGCGCUAGAUAGAACUGAUGCGGACGUAACC  420 

LEVSLNCLTTALDRTDADVT 

421  AUCUACUGCCUGGAUAAGAAGUGGAAGGAAAGAAUCGACGCGGUGCUCCAACUUAAGGAG  480 

IYCLDKKWKERIDAVLQLKE 

481  UCUGUAACAGAGCUGAAGGAUGAGGAUAUGGAGAUCGACGACGAGUUAGUAUGGAUCCAU  540 

SVTELKDEDMEIDDELVWIH 

541  CCGGACAGUUGCCUGAAGGGAAGAAAGGGAUUCAGUACUACAAAAGGAAAGUUGUAUUCG  600 

PDSCLKGRKGFSTTKGKLYS 

601  UACUUUGAAGGCACCAAAUUCCAUCAAGCAGCAAAAGAUAUGGCGGAGAUAAAGGUCCUG  660 

YFEGTKFHQAAKDMAEIKVL 

661  UUCCCAAAUGACCAGGAAAGCAACGAGCAACUGUGUGCCUACAUAUUGGGGGAGACCAUG  720 

FPNDQESNEQLCAYILGETM 

721  GAAGCAAUCCGCGAAAAAUGCCCGGUCGACCACAACCCGUCGUCUAGCCCGCCAAAAACG  780 

EAIREKCPVDHNPSSSPPKT 

781  CUGCCGUGCCUCUGCAUGUAUGCCAUGACGCCAGAAAGGGUCCACAGACUCAGAAGCAAC  840 

LPCLCMYAMTPERVHRLRSN 

841  AACGUCAAAGAAGUUACAGUAUGCUCCUCCACCCCCCUUCCAAAGUACAAAAUCAAGAAC  900 

NVKEVTVCSSTPLPKYKIKN 

901  GUUCAGAAGGUUCAGUGCACAAAAGUAGUCCUGUUUAACCCGCAUACCCCUGCAUUCGUU  960

VQKVQCTKVVLFNPHTPAFV 

961  CCCGCCCGUAAGUACAUAGAAGCGCCAGAACAGCCUGCAGCUCCGCCUGCACAGGCCGAG  1020 

PARKYIEAPEQPAAPPAQAE 

1021  GAGGCCCCCGAAGUUGCAGCAACACCAACACCACCUGCAGCUGAUAACACCUCGCUUGAU  1080 

EAPEVAATPTPPAADNTSLD 

1081  GUCACGGACAUCUCACUGGACAUGGAAGACAGUAGCGAAGGCUCACUCUUUUCGAGCUUU  1140 

VTDISLDMEDSSEGSLFSSF 

1141  AGCGGAUCGGACAACUCUAUUACUAGUAUGGACAGUUGGUCGUCAGGACCUAGUUCACUA  1200 

SGSDNSITSMDSWSSGPSSL 

1201  GAGAUAGUAGACCGAAGGCAGGUGGUGGUGGCUGACGUCCAUGCCGUCCAAGAGCCUGCC  1260

EIVDRRQVVVADVHAVQEPA 

1261  CCUGUUCCACCGCCAAGGCUAAAGAAGAUGGCCCGCCUGGCAGCGGCAAGAAUGCAGGAA 
PVPPPRLKKMARLAAARMQE 

1321  GAGCCAACUCCACCGGCAAGCACCAGCUCUGCGGACGAGUCCCUUCACCUUUCUUUUGGU 
EPTPPASTSSADESLHLSFG 

1381  GGGGUAUCCAUGUCCUUCGGAUCCCUUUUCGACGGAGAGAUGGCCCGCUUGGCAGCGGCA 
GVSMSFGSLFDGEMARLAAA 

1441  CAACCCCCGGCAAGUACAUGCCCUACGGAUGUGCCUAUGUCUUUCGGAUCGUUUUCCGAC 
QPPASTCPTDVPMSFGSFSD

1501  GGAGAGAUUGAGGAGCUGAGCCGCAGAGUAACCGAGUCUGAGCCCGUCCUGUUUGGGUCA 
GEIEELSRRVTESEPVLFGS

1561  UUUGAACCGGGCGAAGUGAACUCAAUUAUAUCGUCCCGAUCAGCCGUAUCUUUUCCACCA 
FEPGEVNSIISSRSAVSFPP 

1621  CGCAAGCAGAGACGUAGACGCAGGAGCAGGAGGACCGAAUACUGACUAACCGGGGUAGGU
RKQRRRRRSRRTEY*LTGVG 

1681  GGGUACAUAUUUUCGACGGACACAGGCCCUGGGCACUUGCAAAAGAAGUCCGUUCUGCAG 
GYIFSTDTGPGHLQKKSVLQ 

1741  AACCAGCUUACAGAACCGACCUUGGAGCGCAAUGUUCUGGAAAGAAUCUACGCCCCGGUG 
NQLTEPTLERNVLERIYAPV 

1801  CUCGACACGUCGAAAGAGGAACAGCUCAAACUCAGGUACCAGAUGAUGCCCACCGAAGCC 
LDTSKEEQLKLRYQMMPTEA 

1861  AACAAAAGCAGGUACCAGUCUAGAAAAGUAGAAAAUCAGAAAGCCAUAACCACUGAGCGA 
NKSRYQSRKVENQKAITTER 

1921  CUGCUUUCAGGGCUACGACUGUAUAACUCUGCCACAGAUCAGCCAGAAUGCUAUAAGAUC 
LLSGLRLYNSATDQPECYKI 

1981  ACCUACCCGAAACCAUCGUAUUCCAGCAGUGUACCGGCGAACUACUCUGACCCAAAGUUU 
TYPKPSYSSSVPANYSDPKF 

2041  GCUGUAGCUGUUUGCAACAACUAUCUGCAUGAGAAUUACCCGACGGUAGCAUCUUAUCAG 
AVAVCNNYLHENYPTVASYQ 

2101  AUCACCGACGAGUACGAUGCUUACUUGGAUAUGGUAGACGGGACAGUCGCUUGUCUAGAU 
ITDEYDAYLDMVDGTVACLD 

2161  ACUGCAACUUUUUGCCCCGCCAAGCUUAGAAGUUACCCGAAAAGACACGAGUAUAGAGCC
TATFCPAKLRSYPKRHEYRA 

2221  CCAAACAUCCGCAGUGCGGUUCCAUCAGCGAUGCAGAACACGUUGCAAAACGUGCUCAUU 
PNIRSAVPSAMQNTLQNVLI 

2281  GCCGCGACUAAAAGAAACUGCAACGUCACACAAAUGCGUGAAUUGCCAACACUGGACUCA 
AATKRNCNVTQMRELPTLDS 

2341  GCGACAUUCAACGUUGAAUGCUUUCGAAAAUAUGCAUGUAAUGACGAGUAUUGGGAGGAG
ATFNVECFRKYACNDEYWEE 

2401  UUUGCCCGAAAGCCAAUUAGGAUCACUACUGAGUUCGUUACCGCAUACGUGGCCAGACUG 
FARKPIRITTEFVTAYVARL

2461  AAAGGCCCUAAGGCCGCCGCACUGUUCGCAAAGACGCAUAAUUUGGUCCCAUUGCAAGAA 
KGPKAAALFAKTHNLVPLQE 

2521  GUGCCUAUGGAUAGGUUCGUCAUGGACAUGAAAAGAGACGUGAAAGUUACACCUGGCACG 
VPMDRFVMDMKRDVKVTPGT 

2581  AAACACACAGAAGAAAGACCGAAAGUACAAGUGAUACAAGCCGCAGAACCCCUGGCGACC 
KHTEERPKVQVIQAAEPLAT 

2641  GCUUACCUGUGCGGGAUCCACCGGGAGUUAGUGCGCAGGCUUACAGCCGUCUUGCUACCC 
AYLCGIHRELVRRLTAVLLP 

2701  AACAUUCACACGCUUUUUGACAUGUCGGCGGAGGACUUUGAUGCAAUCAUAGCAGAACAC 
NIHTLFDMSAEDFDAIIAEH 

2761  UUCAAGCAAGGUGACCCGGUACUGGAGACGGAUAUCGCCUCGUUCGACAAAAGCCAAGAC 
FKQGDPVLETDIASFDKSQD

2821  GACGCUAUGGCGUUAACUGGCCUGAUGAUCUUGGAAGACCUGGGUGUGGACCAACCACUA 
DAMALTGLMILEDLGVDQPL 

2881  CUCGACUUGAUCGAGUGCGCCUUUGGAGAAAUAUCAUCCACCCAUCUGCCCACGGGUACC 
LDLIECAFGEISSTHLPTGT 

2941  CGUUUCAAAUUCGGGGCGAUGAUGAAAUCCGGAAUGUUCCUCACGCUCUUUGUCAACACA 
RFKFGAMMKSGMFLTLFVNT 

3001  GUUCUGAAUGUCGUUAUCGCCAGCAGAGUAUUGGAGGAGCGGCUUAAAACGUCCAAAUGU 
VLNVVIASRVLEERLKTSKC 

3061  GCAGCAUUUAUCGGCGACGACAACAUCAUACACGGAGUAGUAUCUGACAAAGAAAUGGCU 
AAFIGDDNIIHGVVSDKEMA 

3121  GAGAGGUGUGCCACCUGGCUCAACAUGGAGGUUAAGAUCAUUGACGCAGUCAUCGGCGAG 
ERCATWLNMEVKIIDAVIGE

3181  AGACCGCCUUACUUCUGCGGUGGAUUCAUCUUGCAAGAUUCGGUUACCUCCACAGCGUGU 
RPPYFCGGFILQDSVTSTAC 

3241  CGCGUGGCGGACCCCUUGAAAAGGCUGUUUAAGUUGGGUAAACCGCUCCCAGCCGACGAC 
RVADPLKRLFKLGKPLPADD 

3301  GAGCAAGACGAAGACAGAAGACGCGCUCUGCUAGAUGAAACAAAGGCGUGGUUUAGAGUA 
EQDEDRRRALLDETKAWFRV 

3361  GGUAUAACAGACACCUUAGCAGUGGCCGUGGCAACUCGGUAUGAGGUAGACAACAUCACA 
GITDTLAVAVATRYEVDNIT 

3421  CCUGUCCUGCUGGCAUUGAGAACUUUUGCCCAGAGCAAAAGAGCAUUUCAAGCCAUCAGA 
PVLLALRTFAQSKRAFQAIR 

3481  GGGGAAAUAAAGCAUCUCUACGGUGGUCCUAAAUAG  3516 
GEIKHLYGGPK* 
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Figure  4.  Translated  sequence  of  the  nsP3-nsP4  region  of  the  Girdwood  strain  of 
Sindbis  isolated  in  South  Africa.  Conventions  are  the  same  as  in  Figure  2. 
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Figure  5.  nsP3/nsP4  of  Whataroa  M78  (1962,  New  Zealand,  mosquito  pool) 


1  GCGCCAUCGUACAAAUCAAGGAGAGGAAACAUCAUCGAAUGCACCGAAGAAGCCGUCGUG  60 
APSYKSRRGNIIECTEEAVV

61  AACGCUGCCAACGCACUAGGACGCCCCGGAGAAGGGGUCUGCAAGGCGAUUUACAAGAAG  120 

NAANALGRPGEGVCKAIYKK 

121  UGGCCGAACAGCUUCACCGGUUCCGCAACAGAAGUAGGGACUGCAAAAAUGACCACAAGC  180 

WPNSFTGSATEVGTAKMTTS 

181  CUAGGCAAGAAAGUCAUACAUGCCGUCGGACCGGAUUUUAAGAAGCACUCUGAAGAAGAA  240 

LGKKVIHAVGPDFKKHSEEE 

241  GCCCUUAAACUGCUGCAGAAUGCCUACCACGCCAUCGCAGAUAUUAUUAAUGAGAACAAC  300 

ALKLLQNAYHAIADIINENN

301  AUCAAAUCAGUGGCCAUUCCAUUGCUAUCAACUGGUAUAUACGCUGCAGGGAAGGACAGA  360 

IKSVAIPLLSTGIYAAGKDR 

361  CUAGAGACUUCUUUGCACUGUUUGACCACAGCGAUGGACAGGACGGACGCCGACGUAACG  420 

LETSLHCLTTAMDRTDADVT 

421  GUAUACUGCCUUGACAAGAAAUGGCAGCAGCGAAUUGACGCAGUCCUUAGAUUGAAAGAA  480 

VYCLDKKWQQRIDAVLRLKE 

481  GAGGUAACGGAGCUAAAAGACGACGACAUGGAAAUUGAUGAGGAGCUGGUUUGGAUCCAC  540 

EVTELKDDDMEIDEELVWIH 

541  CCUGACAGCUGUUUGAAAGGACGUAAAGGCUUUAGCACCACCAAAGGCAAACUGUAUUCA  600 

PDSCLKGRKGFSTTKGKLYS 

601  UACUUCGAAGGAACUAAAUUUCACCAGGCAGCGAAAGACAUGGCAGAAAUCAAUGUAUUG  660 

YFEGTKFHQAAKDMAEINVL 

661  UUUCCAGACACCAUUGAGGCUAACGAGCAAAUCUGUAUGUAUAUCCUUGGAGAAAGCAUG  720 

FPDTIEANEQICMYILGESM 

721  GAAGCUAUCCGCGAAAAAUGCCCCGUCGACUACAACCCUUCGUCAAGUCCGCCCAAAACC  780 

EAIREKCPVDYNPSSSPPKT 

781  UUACCCUGCCUGUGCAUGUAUGCUAUGACACCUGAGAGGGUGCAUAGACUCAGAAGCAAC  840 

LPCLCMYAMTPERVHRLRSN 

841  AAUGUCAAAGAAAUUACGGUAUGCUCCUCGACUCCACUUCCAAAACAUAAAAUCAAGAAC  900 

NVKEITVCSSTPLPKHKIKN 

901  GUACAACGAAUCCAGUGUUCAAAAAUCGUCUUGUUUAAUCCCCAGACUCCAGCUUUUGUA  960 

VQRIQCSKIVLFNPQTPAFV 

961  CCUGCACGUAAGUUCAUAGAAACCGAACCCAAAGAAACAGAAGACGAUGCGGCUCAGCCG  1020 

PARKFIETEPKETEDDAAQP 

1021  GACCCGACACCGGUAGUGCAGGCGAGUGUUUCGACCCCGGUCCCACAACGUCAGCAAGAC  1080 

DPTPVVQASVSTPVPQRQQD 

1081  CCGUUAGAGUUGAUAAUAUCCGCAGACUCUUUAACCGAAGUAAACGACACCUCUGACGAC  1140 

PLELIISADSLTEVNDTSDD 

1141  AUUUCCGACAUACCCUUUGACACAUCUGUAUAUGCUAGUACUUCCUCACUGAGCUCGGUU  1200 

ISDIPFDTSVYASTSSLSSV 

1201  UUGGACUGCCACAAUGUAGUCGAGGUCGAGGCGGAAAUUCACGUCGUCCCGCAGACUCCG  1260 

LDCHNVVEVEAEIHVVPQTP

1261  GUGGCACCGCCGAGAAAGAAGAAGUUAGCACGUUUAGCGGCGCUAUCAAGAGCAUCUAGC  1320 
VAPPRKKKLARLAALSRASS 

1321  AUUUCCUCCAUCGAAUCCAACCCACCAAUCACUUUUGGAUCAUUUGAGGAUGGAGAAAUA  1380 

ISSIESNPPITFGSFEDGEI 

1381  GACAACUUGCAGAAGAAGUGCACUUCAGAACCAUUUAUGUUCGGCUCGUUCGAACCAGGC  1440 

DNLQKKCTSEPFMFGSFEPG 

1441  GAAGUCAACAGCCUGAUAGAAACCAGGUCGGAGCCACCACGUAGGGGGCGCAGACGUCGC  1500 

EVNSLIETRSEPPRRGRRRR 

1501  AACAAGAACCGACAGGAGUAUUGACUAACCGGGGUAGGUGGGUACAUCUUCUCGACGGAC  1560 

NKNRQEY*LTGVGGYIFSTD 

1561  ACUAAUGAAGGACACCUCCAGAAGAAAUCGGUACUCCAAAAUGAUCUGGCAGUCACCAUU  1620 

TNEGHLQKKSVLQNDLAVTI 

1621  UUAGAACGGAACAUAUUGGAAAAAGUCCAUGCACCCGUGUACAACGCUGAAAAAGAGGAG  1680 

LERNILEKVHAPVYNAEKEE 

1681  AUACUGAAAAUGAAGUACCAGAUGAUGCCCACCGAAACCAACAAGAGUCGGUACCAAUCG  1740 

ILKMKYQMMPTETNKSRYQS 

1741  AGAAAAGUAGAAAAUCAAAAAGCAGUAACUACCCAACGUCUAUUAUCAGGACUGAAACUU  1800 

RKVENQKAVTTQRLLSGLKL 

1801  UAUACAUAUGAGCCUAACCAACCGGAGUGCUACAAAACCACAUAUCCGAGACCAUUGUAU  1860 

YTYEPNQPECYKTTYPRPLY 

1861  UCUAGUAGCAUACCAGUUAGUUACGAUAGCGCACAAGUGGCGGUCGCAGUGUGCAAUAAC  1920 

SSSIPVSYDSAQVAVAVCNN 

1921  UACCUGCAUGAAAACUAUCCGACUGUCGCAUCUUACCAGAUUACCGACGAGUACGACGCU  1980 

YLHENYPTVASYQITDEYDA 

1981  UACCUAGACAUGGUGGAUGGCGCUGUCGCUUGUCUGGACACAGCUACAUUUUGUCCAGCU  2040 

YLDMVDGAVACLDTATFCPA 

2041  AAGCUCAGGAGCUUCCCGAAGAAGCAUGAAUAUAAGACUCCCGAAAUUCGCAGCGCUGUC  2100 

KLRSFPKKHEYKTPEIRSAV 

2101  CCCUCCGCCAUGCAGAACACACUACAGAAUGUACUCAUUGCCGCGACGAAACGAAACUGC  2160 

PSAMQNTLQNVLIAATKRNC 

2161  AACGUUACUCAGAUGCGAGAAUUACCAACAUUGGAUUCAGCGACUUUUAACGUGGAAUGC  2220 

NVTQMRELPTLDSATFNVEC

2221  UUCAAAAAAUUUGCGUGUAAUGACGAGUACUGGAGCGAAUUUCGUGACAAACCCAUCAGA  2280 

FKKFACNDEYWSEFRDKPIR 

2281  AUAACAACCGAAUUCGUUACCUCGUACGUAGCGCGACUAAAAGGACCAAAGGCAGCGGCG  2340

ITTEFVTSYVARLKGPKAAA 

2341  UUGUUCGCAAAAACUCAUAACCUAGUUCCCUUGCAAGAAGUUCCUAUGGAUAGGUUUGUC  2400 

LFAKTHNLVPLQEVPMDRFV 

2401  AUGGACAUGAAGAGGGACGUUAAAGUCACACCCGGAACAAAACACACAGAAGAGAGACCA  2460 

MDMKRDVKVTPGTKHTEERP 

2461  AAAGUCCAAGUCAUCCAGGCCGCUGAGCCGCUAGCUACCGCAUACUUAUGCGGAAUCCAC  2520 

KVQVIQAAEPLATAYLCGIH 

2521  CGAGAACUGGUUAGGAGGCUGACUGCUGUACUACUUCCGAACAUUCACACCCUGUUCGAU  2580 

RELVRRLTAVLLPNIHTLFD

2581  AUGUCGGCCGAAGAUUUUGACGCUAUCAUAGCUGAACAUUUCAACUAUGGGGACCCUGUC  2640 

MSAEDFDAIIAEHFNYGDPV

2641  UUAGAAACCGACAUCGCGUCGUUCGACAAAAGUCAGGACGACGCCAUGGCCCUGACCGGC  2700 

LETDIASFDKSQDDAMALTG 

2701  CUGAUGAUCCUUGAAGACUUGGGUGUCGACCAGCCCCUUUUAGACCUUAUUGAAUGUGCG  2760 

LMILEDLGVDQPLLDLIECA 

2761  UUCGGCGAAAUCUCCUCGACGCAUCUCCCGACAGGUACGAGAUUCAAAUUUGGAUCGAUG  2820 

FGEISSTHLPTGTRFKFGSM 

2821  AUGAAAUCUGGAAUGUUCCUCACCCUGUUUGUCAACACUGUGCUGAAUGUUGUAAUCGCC  2880 

MKSGMFLTLFVNTVLNVVIA 

2881  AGCAGGGUCCUAGAGCAUAGACUGAAAGAGUCGCGAUGCGCCGCAUUCAUCGGUGAUGAC  2940 

SRVLEHRLKESRCAAFIGDD 

2941  AACAUAAUACACGGCGUAGUGUCUGACAAGGAAAUGGCAGAAAGAUGCGCUACCUGGCUU  3000

NIIHGVVSDKEMAERCATWL

3001  AACAUGGAAGUGAAGAUCAUCGACGCCGUCAUAGGCAUCAGACCUCCAUAUUUUUGUGGU  3060 

NMEVKIIDAVIGIRPPYFCG 

3061  GGAUUCAUCCUUCAAGAUGAGACGACAUUAACCACAUGUCGCGUCGCCGAUCCGCUUAAG  3120 

GFILQDETTLTTCRVADPLK 

3121  AGGCUCUUUAAACUAGGUAAACCACUACCCGCGGAGGACACGCAAGAUGAAGACAGAAGA  3180 

RLFKLGKPLPAEDTQDEDRR 

3181  CGUGCCCUUAUGGACGAAACCAAAGCAUGGUUCCGGGUAGGAAUUAGGAACACUCUCGCA  3240 

RALMDETKAWFRVGIRNTLA 

3241  GUUGCCGUAUCGACCAGGUACGAGGUAGAAGAUAUUACACCCGUUCUAUACGCGCUUAGA  3300 

VAVSTRYEVEDITPVLYALR 

3301  ACAUUCGCUCAAAGCAAAAAGGCCUUCCAGACUAUACGAGGAGAAAUAAGACAGCUCUAC  3360 

TFAQSKKAFQTIRGEIRQLY 

3361  GGCGGUCCUAAAUAG  3375 

GGPK*


Figure  5.  Translated  sequence  of  the  nsP3-nsP4  regions  of  Whataroa  virus, 
isolated  in  New  Zealand.  It  is  clear  from  this  sequence  that  Whataroa  virus  is 
closely  related  to  Sindbis  virus. 
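
The translations shown beneath the nucleotide sequences in Figures 4 and 5 can be reproduced with a short script. The following is a minimal sketch, assuming the standard genetic code and a reading frame that starts at the first nucleotide supplied; the names `CODON_TABLE` and `translate` are illustrative and are not part of the original analysis. As in the figures, translation continues past a stop codon, which is written as `*`.

```python
# Standard genetic code, with codons ordered U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(rna: str) -> str:
    """Translate an RNA string codon by codon.

    '*' marks a stop codon; 'X' marks a codon that cannot be read
    (for example one containing an ambiguous base).
    """
    rna = rna.upper().replace("T", "U").replace(" ", "")
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    codons = (rna[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Last 15 nucleotides of the Whataroa sequence in Figure 5 (positions 3361-3375).
print(translate("GGCGGUCCUAAAUAG"))  # GGPK*
```

Running the same function over a full 60-nucleotide line of either figure should return the 20-residue line printed beneath it, which is a convenient check for transcription errors.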
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Figure  6.  Percent  amino  acid  differences  between  the  different  isolates  of 
Sindbis  virus  in  two  regions  of  the  nonstructural  proteins. 
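
A percent amino acid difference of the kind summarized in Figure 6 can be computed by counting mismatched positions between two aligned protein sequences. The sketch below is illustrative only: it assumes an ungapped alignment of equal-length sequences, and it uses just the last 24 residues of nsP4 from Figures 4 and 5 rather than the regions actually compared in Figure 6. The function name `percent_difference` is hypothetical.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of positions that differ between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * differences / len(seq_a)


# C-terminal 24 residues of nsP4 (stop codon omitted), taken from Figures 4 and 5.
girdwood = "TFAQSKRAFQAIRGEIKHLYGGPK"
whataroa = "TFAQSKKAFQTIRGEIRQLYGGPK"

print(round(percent_difference(girdwood, whataroa), 1))  # 16.7 (4 of 24 positions differ)
```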
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development  of  techniques  for  construction  of  such  a  library.  The  details  of  the  methods 
developed are presented in the Methods section; careful attention to detail was required in order to
obtain  a  random  library.  With  the  automated  DNA  sequencer,  24  DNA  samples  can  be  analyzed  at 
one  time  and  each  sample  can  be  read  automatically  to  more  than  400  nucleotides.  Thus  about 
10  kb  of  sequence  is  obtained  from  a  single  run.  To  obtain  the  complete  sequence  of  a  viral  RNA 
requires  over-sequencing  of  the  genome  because  of  compression  artifacts  and  occasional 
misreading  by  the  machines  and  a  slightly  nonrandom  distribution  of  the  sequences  obtained.  The 
procedure  developed  as  the  most  efficient  is  to  over-sequence  about  three-fold,  that  is  to  obtain 
about  30  kb  of  sequence  for  the  12  kb  RNA,  align  this  sequence  using  computer  programs  and 
using  the  homology  between  different  alphaviruses,  and  then  to  fill  in  any  gaps  that  might  still  exist 
by  designing  PCR  primers  that  can  be  used  to  obtain  double-stranded  cDNA  for  the  missing 
segments  and  sequence  this  DNA  manually.  Fig.  7  illustrates  the  distribution  of  cDNA  clones 
obtained using the technology developed for Whataroa virus and shows the random nature of the
clones  obtained.  Fig.  8  illustrates  sequence  output  from  the  Applied  Biosystems  sequenator  for 
one  clone  (the  original  output  is  in  four  colors,  a  different  color  for  each  of  the  four  nucleotides, 
which  aids  in  interpreting  the  data).  This  sequence  is  automatically  recorded  in  a  computer  file. 
Fig. 9 shows the DNA sequence obtained from this clone of Whataroa virus and compares the
sequence to that of Sindbis virus. The technology is well suited to obtaining large amounts of
sequence from alphaviruses and makes it conceptually feasible to examine a large number of
different alphaviruses or of strains of one alphavirus isolated from different locations of the world
in  order  to  examine  the  relationships  of  the  viruses  to  one  another.  The  sequence  of  the  nsP3-nsP4 
region  of  Whataroa  virus  obtained  by  this  method  was  shown  in  Fig.  5. 
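
The throughput figures quoted above can be checked with simple arithmetic. The sketch below is a rough planning aid, not part of the original methods: the function name `sequencing_plan` and its parameters are hypothetical, the read length is taken as exactly 400 nucleotides although the text says reads exceed 400, and no allowance is made for failed lanes or redundant clones.

```python
import math


def sequencing_plan(genome_kb=12.0, target_kb=30.0, lanes_per_run=24, read_length_nt=400):
    """Estimate yield per run and the number of runs needed to reach a target."""
    kb_per_run = lanes_per_run * read_length_nt / 1000.0
    return {
        "kb_per_run": kb_per_run,                      # raw sequence from one run
        "coverage": target_kb / genome_kb,             # fold over-sequencing
        "reads_needed": math.ceil(target_kb * 1000 / read_length_nt),
        "runs_needed": math.ceil(target_kb / kb_per_run),
    }


plan = sequencing_plan()
print(plan)
# kb_per_run is 9.6 (the "about 10 kb" per run in the text), coverage is 2.5,
# and 75 reads, or 4 runs, are needed to reach 30 kb at exactly 400 nt per read.
```

Because actual reads are somewhat longer than 400 nucleotides, the number of runs required in practice may be lower than this conservative estimate.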

Mapping  of  a  Neutralizing  Antibody-Binding  Site  in  Glycoprotein  E2 

A λgt11 library containing randomly generated 100-300 base pair Sindbis cDNA inserts in
the lacZ gene was tested for reactivity with 6 mAbs, using 125I-protein G to detect the presence of
mAb (all were IgGs) bound to immunoreactive phage clones on nitrocellulose filters. Four


genome. Each line below represents a cloned insert which has been
sequence with that of Sindbis virus.


Figure  8.  Automated  sequence  analysis  of  clone  NZ430  of  Whataroa  virus. 
2200  CGGTCCCGTACAAGGTCGAAACAATAGGAGTGATAGGCACACCGGGGTCG  2249
                             :       ||    |     || |
   1  .......................NGGTATACGACTCACTATAGGGCGAAT  27

2250  GGCAAGTCAGCTATTATCAAGTCAACTGTCACGGCACGAGATCTTGTTAC  2299
        | | || || || || ||    |  |||||  |  | |||||||| ||
  28  TCCGAATCCGCAATCATTAAAAACATCGTCACTACCAGGGATCTTGTGAC  77

2300  CAGCGGAAAGAAAGAAAATTGTCGCGAAATTGAGGCCGACGTGCTAAGAC  2349
      |||||||||||||||||| || || ||||| || || ||||| || | ||
  78  CAGCGGAAAGAAAGAAAACTGCCGGGAAATAGAAGCTGACGTCCTCAAAC  127

2350  TGAGGGGTATGCAGATTACGTCGAAGACAGTAGATTCGGTTATGCTCAAC  2399
         |    ||||| ||    || ||||| || || || ||| |||| ||
 128  ACCGAAAAATGCAAATCGTTTCAAAGACGGTCGACTCCGTTTTGCTTAAT  177

2400  GGATGCCACAAAGCCGTAGAAGTGCTGTACGTTGACGAAGCGTTCGCGTG  2449
      || ||||||||  | || ||  | ||||| || | |||||| | ||||||
 178  GGTTGCCACAAGTCAGTCGACATCCTGTATGTCG.CGAAGCTTACGCGTG  226

2450  CCACGCAGGAGCACTACTTGCCTTGATTGCTATCGTCAGGCCCCGCAAGA  2499
      |||||| ||  | ||| | ||||| || || || ||| | ||  | || |
 227  CCACGCTGGCACCCTATTGGCCTTAATCGCCATAGTCCGACCTAGAAATA  276

2500  AGGTAGTACTATGCGGAGACCCCATGCAATGCGGATTCTTCAACATGATG  2549
      | || || ||||| || ||||| |  || || || |||||||||||||||
 277  AAGTGGTCCTATGTGGCGACCCAAAACAGTGTGGTTTCTTCAACATGATG  326

2550  CAACTAAAGGTACATTTCAATCACCCTGAAAAAGACATATGCACCAAGAC  2599
      || || ||||| || || ||  ||||||||   ||||| ||||| |:|||
 327  CAGCTGAAGGTCCACTTTAACGACCCTGAACGCGACATTTGCACGANGAC  376

2600  ATTCTACAAGTATATCTCCCGGCGTTGCACACAGCCAGTTACAGCTATTG  2649
       |||||||| || || || || || ||||| || || || ||||| ||||
 377  GTTCTACAAATACATTTCTCGTCGGTGCACGCAACCGGTGACAGCAATTG  426

2650  TATCGACACTGCATTACGATGGAAAGATGAAAACCACGAACCCGTGCAAG  2699
      | || |||||||| ||  |  |||| |||  :||||| ||||| || ||
 427  TGTCTACACTGCACTA.TAACGAAAAATGCGNACCACCAACCCATGTAAC  475

2700  AAGAACATTGAAATCGATATTACAGGGGCCACAAAGCCGAAGCCAGGGGA  2749
      |||||||| | |||||: ||||| ||    || || |  ||    ||
 476  AAGAACATCGTAATCGNCATTACCGGACAAACCAACCAAAACAGGGGTAT  525

2750  TATC......ATCCTGACATGTTTCCGCGGGTGGGTTAAGCAATTGCAAA  2793
      | ||       | | |   |   || || | |  ||| |      ||| |
 526  TTTCTGACGTGTTCAGGGGTTGGTCAGCAGTTCAGTTGATACCAGGCAGA  575

2794  TCGACTATCCCGGACATGAAGTAATGACAGCCGCGGCCTCACAAGGGCTA  2843
           : ||   |  :: | || :
 576  GTTTTNCTCGGAGTTNNAAGGTTNC.........................  600


Figure  9.  Comparison  of  the  sequence  of  clone  NZ430  of  Whataroa  virus 
(from  data  in  Figure  8)  with  the  sequence  of  Sindbis  virus.  Vertical  lines 
between nucleotides highlight identity; vertical dots show ambiguities.
Underlined sequence is that of the EcoRI linker used to construct the
clone;  sequence  upstream  of  this  is  vector. 
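
The annotation convention described in this caption can be expressed compactly in code. The following sketch regenerates the line of vertical bars and dots for one pair of already-aligned rows; it assumes that `.` denotes a gap and `N` an ambiguous base call, and it does not perform the alignment itself. The function name `match_line` is illustrative.

```python
def match_line(top: str, bottom: str, gap: str = ".", ambiguous: str = "N") -> str:
    """Build the annotation row for two aligned sequences of equal length.

    '|' marks identical nucleotides, ':' marks a position where either
    sequence has an ambiguous base call, and ' ' marks a mismatch or gap.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    marks = []
    for a, b in zip(top.upper(), bottom.upper()):
        if a == gap or b == gap:
            marks.append(" ")
        elif a == ambiguous or b == ambiguous:
            marks.append(":")
        elif a == b:
            marks.append("|")
        else:
            marks.append(" ")
    return "".join(marks)


# Final block of Figure 9: Sindbis nucleotides 2794-2818 against clone NZ430 576-600.
sindbis = "TCGACTATCCCGGACATGAAGTAAT"
nz430 = "GTTTTNCTCGGAGTTNNAAGGTTNC"
line = match_line(sindbis, nz430)
print(sindbis)
print(line)
print(nz430)
print(line.count("|"), "identities;", line.count(":"), "ambiguities")
```

For this block the function reports 6 identities and 4 ambiguities, the same counts as the marks that survive in the scanned figure.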

positive phage clones, designated λ23a, λ23b, λ23c and λ23d, were identified when mAb 23 was
used to screen 10^6 plaques. Results with two of these clones are illustrated in Fig. 10; also
shown is a control in which a nonreactive region of E2 is present as an insert in the λgt11 clone.
These four phages were plaque purified and DNA was prepared from each (Young and Davis, 1983).
The inserts were removed with EcoRI, subcloned into vector M13mp18, and sequenced by the
dideoxy chain termination procedure (Sanger et al., 1977). The four inserts contained overlapping
sequences from the central region of glycoprotein E2 (Fig. 11). The insert in λ23a comprised E2
residues 155-258, that in λ23b residues 173-251, that in λ23c residues 145-223, and that in
λ23d residues 169-220. Thus the domain from residues 173 to 220 is present in all four inserts, and the
neutralizing epitope recognized by mAb 23 must lie within this region. It is of note that this
overlap region is 2-3 fold larger than the 15-22 amino acid residues found to contact antibody in
epitopes defined by X-ray diffraction analysis (Laver et al., 1990), and it is conceivable that the
epitope could be formed by a folded structure with contributions from residues throughout this
region.
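
The shared domain follows directly from the insert coordinates: it runs from the largest start residue to the smallest end residue. A minimal sketch of that calculation is given below; the dictionary keys and the function name `common_region` are illustrative labels, while the coordinates are those reported above.

```python
# E2 residue ranges (inclusive) of the four inserts reactive with mAb 23.
inserts = {
    "lambda23a": (155, 258),
    "lambda23b": (173, 251),
    "lambda23c": (145, 223),
    "lambda23d": (169, 220),
}


def common_region(intervals):
    """Return the inclusive (start, end) shared by all intervals, or None."""
    intervals = list(intervals)
    start = max(s for s, _ in intervals)
    end = min(e for _, e in intervals)
    return (start, end) if start <= end else None


region = common_region(inserts.values())
print(region)                      # (173, 220)
print(region[1] - region[0] + 1)   # 48 residues
```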


We also attempted to identify fusion proteins immunoreactive with four other E2-specific
neutralizing mAbs, namely mAbs 18, 50, 51, and 49, as well as fusion proteins immunoreactive
with mAb 33, specific for glycoprotein E1. In each case 10^6 plaques were screened. No positive
plaques could be identified with any of these antibodies. We concluded that these antibodies
probably react with conformational epitopes not present in the λgt11 library, either because these
epitopes are discontinuous or because they consist of conformations not assumed by the fusion proteins.

Conclusions 

Rapid  Sequencing  of  Virus  RNAs.  High  throughput  automated  DNA  sequencing  is 
ideally  suited  to  obtaining  large  amounts  of  sequence  data  for  strains  of  alphaviruses  or  for  other 
viruses.  The  methods  that  we  have  developed  can  be  used  for  any  RNA  virus  and  are  suitable  to 

Figure 10. Reactivities of phage clones λ23a and λ23b with MAb 23.
Immunoreactive phage plaques were picked and rescreened until a uniformly
reactive population was obtained. Illustrated are the final populations for two
reactive clones and a nonreactive clone expressing amino acids 129-192 of E2
(control). Phage stocks were plated on E. coli Y1090 in the presence of the inducer
and the plaques transferred to nitrocellulose. Filters were incubated with MAb
23 followed by 125I-conjugated protein G and autoradiographed. Comparison of
the autoradiogram with the pattern of plaques on the petri plate showed that all of
the λ23a and λ23b plaques reacted with the antibody.

Figure 11. Schematic representation of an antigenically important domain of Sindbis virus glycoprotein E2. The relative
locations of the inserts in four λgt11 clones reactive with MAb 23 are mapped. The overlap region in these four clones
between residues 173 and 220 of E2 is expanded below, with a number of key features indicated. Residues altered in
variants resistant to MAbs are boxed and a carbohydrate attachment site is indicated with a stalked symbol (CHO).

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obtaining  sequence  information  for  different  viruses  belonging  to  a  virus  group  or  to  obtaining 
sequence  for  strains  of  the  same  virus  isolated  from  different  geographic  regions.  This  makes  it 
feasible  to  examine  a  large  number  of  isolates  and  therefore  to  determine  the  relationships  among  a 
group  of  viruses  or  to  search  for  emergent  viruses  that  differ  in  certain  fundamental  ways  from 
other  members  of  the  group.  The  sequences  presented  in  this  report  are  an  example  of  what  it  is 
possible  to  do.  These  sequences  examine  the  relationships  among  a  number  of  different 
geographic  isolates  of  Sindbis-like  viruses. 


An Antibody Binding Domain in E2. The λgt11 system provides a rapid, specific,
and sensitive strategy for the physical mapping on large viral genomes of the genes encoding
proteins for which antibody reagents are available. We used small Sindbis virus genomic inserts in
an attempt to define the immunoreactive domain of the protein more precisely. The limitation of the
λgt11 system is that these protein domains are expressed as part of a fusion protein and
thus may not fold in the same way as the native protein, so only antibodies that interact with
contiguous linear domains of the proteins of interest may be reactive with phage plaques.

From  the  sequence  of  the  inserts  in  the  four  clones  immunoreactive  with  mAb  23,  it  is  clear 
that  this  antibody  can  react  with  a  single  continuous  region  of  the  Sindbis  glycoprotein  E2,  and  that 
the  neutralization  epitope  must  lie  within  the  48  residues  between  amino  acids  173  and  220.  This 
result  is  consistent  with  the  results  from  mapping  of  antibody  escape  variants  resistant  to  mAb  23 
(Fig. 11). Sequencing of 3 independent variants resistant to mAb 23 and of 2 independent
revertants  selected  to  be  sensitive  again  to  mAb  23,  as  well  as  of  other  variants,  has  shown  that 
residue 216 is important for reactivity with mAb 23 (Strauss et al., 1991). Viruses with Lys-216
were fully sensitive to mAb 23, viruses with Val-216 or Ile-216 demonstrated a reduced sensitivity to
mAb 23, and viruses with Glu-216 were resistant to mAb 23. From the results obtained here, it
appears  likely  therefore  that  residue  216  interacts  directly  with  mAb  23. 

Although the remaining antibodies tested failed to react with the λgt11 library, presumably
because  they  react  with  conformational  epitopes  not  present  in  the  library,  it  seems  likely  that  the 
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E2-specific  mAbs  50,  51,  49,  and  18  also  bind  to  epitopes  at  least  partially  encompassed  within 
this  same  domain.  Variants  selected  to  be  resistant  to  these  mAbs  were  all  found  to  have  amino 
acid  changes  responsible  for  the  escape  from  neutralization  within  the  domain  from  residues  181  to 
216 (Fig. 11) (Strauss et al., 1991). Furthermore, mAb 23 and these mAbs all react with closely
spaced  or  overlapping  epitopes  as  defined  by  competition  assays  or  by  the  pattern  of  cross 
reactivity  of  different  variants  resistant  to  the  various  antibodies  (Davis  et  al.,  1987;  Schmaljohn  et 
al.,  1983;  Strauss  et  al.,  1991).  The  results  are  all  consistent  with  the  hypothesis  that  the  E2 
domain  between  173  and  220  forms  a  major  antibody  binding  region  important  for  neutralization  of 
virus infectivity. This domain is illustrated in Fig. 11 with the locations of antibody escape
variants  shown  and  the  region  selected  by  mAb  23  indicated.  This  domain  is  hydrophilic, 
containing  25%  charged  residues,  and  has  a  glycosylation  site  at  Asn-196,  and  thus  is  almost 
certainly  exposed  on  the  surface  of  the  glycoprotein  spike  (Strauss  and  Strauss,  1986). 

We  have  previously  found  that  an  antiidiotypic  antibody  to  mAb  23,  as  well  as  antiidiotypic 
antibodies  to  mAbs  49  and  50,  function  as  antireceptor  antibodies  in  chicken  cells  (Wang  et  al., 
1991).  This  suggests  that  the  E2  domain  defined  by  the  fusion  protein  reactive  with  mAb  23  and 
by  the  antibody  escape  variants  might  form  part  of  the  antireceptor  on  the  virus  spike  that  binds  to 
the  cellular  receptor.  This  hypothesis  is  supported  by  the  observation  that  two  strains  of  Sindbis 
virus  that  differ  only  in  having  Gly  or  Arg  at  residue  172  of  E2  differ  in  their  ability  to  bind  to 
neuroblastoma  cells  in  culture  (Tucker  and  Griffin,  1991)  and  differ  in  their  neurovirulence  for 
mice  (Lustig  et  al.,  1988). 

These  results  make  clear  that  the  region  of  E2  between  residues  170  and  220  contains  a 
number  of  dominant  epitopes,  and  that  this  region  is  a  key  region  for  the  development  of  vaccines. 
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